-
Free, publicly-accessible full text available March 26, 2026
-
We consider the problem of determining the manifold $$n$$-widths of Sobolev and Besov spaces with error measured in the $$L_p$$-norm. The manifold widths control how efficiently these spaces can be approximated by general non-linear parametric methods with the restriction that the parameter selection and parameterization maps must be continuous. Existing upper and lower bounds only match when the Sobolev or Besov smoothness index $$q$$ satisfies $$q\leq p$$ or $$1 \leq p \leq 2$$. We close this gap and obtain sharp lower bounds for all $$1 \leq p,q \leq \infty$$ for which a compact embedding holds. A key part of our analysis is to determine the exact value of the manifold widths of finite-dimensional $$\ell^M_q$$-balls in the $$\ell_p$$-norm when $$p\leq q$$. Although this result is not new, we provide a new proof and apply it to lower bounding the manifold widths of Sobolev and Besov spaces. Our results show that the Bernstein widths, which are typically used to lower bound the manifold widths, decay asymptotically faster than the manifold widths in many cases.

Free, publicly-accessible full text available December 1, 2025
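
For reference, the manifold widths discussed above are commonly defined as follows (a standard formulation in the sense of DeVore, Howard, and Micchelli; the paper's precise normalization may differ):

```latex
% Manifold n-width of a compact set K in a Banach space X:
% the best error achievable by an n-parameter method whose
% parameter-selection map a and parameterization map M are both continuous.
\[
  \delta_n(K)_X
  \;=\;
  \inf_{\substack{a:\, K \to \mathbb{R}^n \\ M:\, \mathbb{R}^n \to X \\ a,\, M \ \text{continuous}}}
  \;\sup_{f \in K}\;
  \bigl\| f - M(a(f)) \bigr\|_X .
\]
```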
-
Kolmogorov-Arnold Networks (KANs; Liu et al., 2024) were very recently proposed as a potential alternative to the prevalent architectural backbone of many deep learning models, the multi-layer perceptron (MLP). KANs have seen success in various tasks of AI for science, with their empirical efficiency and accuracy demonstrated in function regression, PDE solving, and many other scientific problems. In this article, we revisit the comparison of KANs and MLPs, with emphasis on a theoretical perspective. On the one hand, we compare the representation and approximation capabilities of KANs and MLPs. We establish that MLPs can be represented using KANs of a comparable size. This shows that the approximation and representation capabilities of KANs are at least as good as those of MLPs. Conversely, we show that KANs can be represented using MLPs, but that in this representation the number of parameters increases by a factor of the KAN grid size. This suggests that KANs with a large grid size may be more efficient than MLPs at approximating certain functions. On the other hand, from the perspective of learning and optimization, we study the spectral bias of KANs compared with MLPs. We demonstrate that KANs are less biased toward low frequencies than MLPs. We highlight that the multi-level learning feature specific to KANs, i.e., grid extension of splines, improves the learning process for high-frequency components. Detailed comparisons with different choices of depth, width, and grid sizes of KANs are made, shedding some light on how to choose the hyperparameters in practice.

Free, publicly-accessible full text available January 22, 2026
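
To make the grid-size factor in the parameter-count comparison concrete, here is a minimal, hypothetical KAN-style layer in NumPy. Each edge carries its own learnable 1-D function, expanded here in piecewise-linear hat functions on a fixed grid (actual KANs use B-splines plus a base activation; all names and sizes below are illustrative, not the paper's construction). A layer of this form has d_in * d_out * grid_size coefficients, versus roughly d_in * d_out for a dense MLP layer.

```python
import numpy as np

def hat_basis(x, grid):
    """Piecewise-linear (order-1 B-spline) hat functions on `grid`,
    evaluated at the 1-D array x; returns shape (len(x), len(grid))."""
    x = np.clip(x, grid[0], grid[-1])
    B = np.zeros((len(x), len(grid)))
    for k, t in enumerate(grid):
        if k > 0:                               # rising edge on [grid[k-1], t]
            left = grid[k - 1]
            m = (x >= left) & (x <= t)
            B[m, k] = (x[m] - left) / (t - left)
        if k + 1 < len(grid):                   # falling edge on [t, grid[k+1]]
            right = grid[k + 1]
            m = (x >= t) & (x <= right)
            B[m, k] = np.maximum(B[m, k], (right - x[m]) / (right - t))
    return B

class ToyKANLayer:
    """Each edge (i, j) carries its own learnable 1-D function
    phi_ij(x) = sum_k coef[i, k, j] * hat_k(x); the layer output is
    y_j = sum_i phi_ij(x_i). Parameter count: d_in * grid_size * d_out."""
    def __init__(self, d_in, d_out, grid_size=8, seed=0):
        rng = np.random.default_rng(seed)
        self.grid = np.linspace(-1.0, 1.0, grid_size)
        self.coef = 0.1 * rng.standard_normal((d_in, grid_size, d_out))

    def __call__(self, x):                      # x: (batch, d_in)
        out = np.zeros((x.shape[0], self.coef.shape[2]))
        for i in range(self.coef.shape[0]):
            B = hat_basis(x[:, i], self.grid)   # (batch, grid_size)
            out += B @ self.coef[i]             # accumulate sum_i phi_ij(x_i)
        return out

x = np.random.default_rng(1).standard_normal((4, 3))
print(ToyKANLayer(3, 2)(x).shape)               # (4, 2)
```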
-
We present a generalization of Nesterov's accelerated gradient descent algorithm. Our algorithm (AGNES) provably achieves acceleration for smooth convex and strongly convex minimization tasks with noisy gradient estimates if the noise intensity is proportional to the magnitude of the gradient at every point. Nesterov's method converges at an accelerated rate if the constant of proportionality is below 1, while AGNES accommodates any signal-to-noise ratio. The noise model is motivated by applications in overparametrized machine learning. AGNES requires only two parameters in convex and three in strongly convex minimization tasks, improving on existing methods. We further provide clear geometric interpretations and heuristics for the choice of parameters.

Free, publicly-accessible full text available December 15, 2025
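
The abstract does not spell out the update rule, so the following is only a generic sketch of Nesterov-style momentum under the multiplicative noise model described above (noise intensity proportional to the gradient magnitude); the actual AGNES update and its parameter choices are given in the paper, and all names below (lr, momentum, sigma) are illustrative.

```python
import numpy as np

def noisy_grad(grad_f, x, sigma, rng):
    """Gradient oracle with multiplicative noise: the noise magnitude
    scales with ||grad f(x)||, matching the abstract's noise model."""
    g = grad_f(x)
    noise = rng.standard_normal(x.shape) / np.sqrt(x.size)
    return g + sigma * np.linalg.norm(g) * noise

def nesterov_with_multiplicative_noise(grad_f, x0, lr=0.005, momentum=0.9,
                                       sigma=0.2, steps=500, seed=0):
    rng = np.random.default_rng(seed)
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(steps):
        y = x + momentum * (x - x_prev)          # look-ahead point
        x_prev, x = x, y - lr * noisy_grad(grad_f, y, sigma, rng)
    return x

# Smoke test on a quadratic f(x) = 0.5 * ||A x||^2 (minimizer x = 0):
A = np.diag([1.0, 10.0])
x_final = nesterov_with_multiplicative_noise(lambda z: A.T @ (A @ z),
                                             np.array([1.0, 1.0]))
print(np.linalg.norm(x_final))  # small: iterates approach the minimizer
```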
-
Free, publicly-accessible full text available January 1, 2026
-
Canonicalization provides an architecture-agnostic method for enforcing equivariance, with generalizations such as frame-averaging recently gaining prominence as a lightweight and flexible alternative to equivariant architectures. Recent works have found an empirical benefit to using probabilistic frames instead, which learn weighted distributions over group elements. In this work, we provide strong theoretical justification for this phenomenon: for commonly used groups, there is no efficiently computable choice of frame that preserves continuity of the function being averaged. In other words, unweighted frame-averaging can turn a smooth, non-symmetric function into a discontinuous, symmetric function. To address this fundamental robustness problem, we formally define and construct weighted frames, which provably preserve continuity, and demonstrate their utility by constructing efficient and continuous weighted frames for the actions of SO(d), O(d), and S_n on point clouds.
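
As a toy illustration of the weighted-averaging mechanism (not the paper's construction, which targets SO(d), O(d), and S_n on point clouds), consider the reflection group {+1, -1} acting on R^d by x -> g*x: continuous, x-dependent weights that swap under the group action make the weighted average invariant while preserving continuity. The orientation score, sigmoid weights, and temperature tau below are illustrative choices.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def weighted_reflection_average(f, x, tau=0.1):
    """Return sum_g w_g(x) * f(g*x) over g in {+1, -1}.
    The weights satisfy w_{+1}(-x) = w_{-1}(x), so the average is
    invariant under x -> -x, and it is continuous in x."""
    s = np.sum(x)                       # a 1-D "orientation score" for x
    w_plus, w_minus = sigmoid(s / tau), sigmoid(-s / tau)  # sum to 1
    return w_plus * f(x) + w_minus * f(-x)

f = lambda x: x[0] + x[0] ** 2          # smooth but not reflection-invariant
x = np.array([0.3, -0.1])
print(np.isclose(weighted_reflection_average(f, x),
                 weighted_reflection_average(f, -x)))  # True: invariant
```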
-
We present convergence estimates of two types of greedy algorithms in terms of the entropy numbers of underlying compact sets. In the first part, we measure the error of a standard greedy reduced basis method for parametric PDEs by the entropy numbers of the solution manifold in Banach spaces. This contrasts with the classical analysis based on the Kolmogorov $$n$$-widths and enables us to obtain direct comparisons between the algorithm error and the entropy numbers, where the multiplicative constants are explicit and simple. The entropy-based convergence estimate is sharp and improves upon the classical width-based analysis of reduced basis methods for elliptic model problems. In the second part, we derive a novel and simple convergence analysis of the classical orthogonal greedy algorithm for nonlinear dictionary approximation using the entropy numbers of the symmetric convex hull of the dictionary. This also improves upon existing results by giving a direct comparison between the algorithm error and the entropy numbers.
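
For the second part, a minimal sketch of the classical orthogonal greedy algorithm (also known as orthogonal matching pursuit) over a finite dictionary may help fix ideas; the dictionary size, normalization, and variable names below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def orthogonal_greedy(target, dictionary, n_steps):
    """target: (m,); dictionary: (m, N) with unit-norm columns.
    At each step, select the dictionary element most correlated with the
    current residual, then orthogonally project the target onto the span
    of all selected elements."""
    selected = []
    residual = target.copy()
    for _ in range(n_steps):
        # Greedy selection: element with the largest |<residual, d_k>|.
        k = int(np.argmax(np.abs(dictionary.T @ residual)))
        selected.append(k)
        # Orthogonal projection via least squares on the selected columns.
        D = dictionary[:, selected]
        coef, *_ = np.linalg.lstsq(D, target, rcond=None)
        residual = target - D @ coef
    return selected, residual

rng = np.random.default_rng(0)
D = rng.standard_normal((50, 200))
D /= np.linalg.norm(D, axis=0)              # normalize columns
y = D[:, :5] @ rng.standard_normal(5)       # target sparse in the dictionary
idx, r = orthogonal_greedy(y, D, n_steps=10)
print(sorted(idx), np.linalg.norm(r))       # residual shrinks with steps
```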